Major Feature Release - Antigravity Provider, Credential Prioritization & Enhanced OAuth #10
base: main
Add a new Antigravity provider and authentication base to integrate with the Antigravity (internal Google) API.
- Add providers/antigravity_auth_base.py: OAuth2 token management with env/file loading, atomic saves, refresh logic, backoff/queue tracking, interactive and headless browser auth flow, and helper utilities.
- Add providers/antigravity_provider.py: request/response transformations (OpenAI → Gemini CLI → Antigravity), model aliasing, thinking/reasoning config mapping, tool response grouping, streaming & non-streaming handling, and base-URL fallback.
- Update provider_factory.py and providers/__init__.py to register the new provider.
- Bump project metadata in pyproject.toml (package name and version).
BREAKING CHANGE: project packaging metadata updated: package name changed to "rotator_library" and version bumped to 0.95. Update any dependency or packaging references that relied on the previous name/version.
…ng_content separation
- Introduce Gemini 3 special mechanics in AntigravityProvider:
  - append a constant thoughtSignature into functionCall payloads to preserve Gemini reasoning continuity
  - filter out thoughtSignature parts from returned content to avoid exposing encrypted reasoning data
  - separate parts flagged with thought=true into a new reasoning_content field while keeping regular content in content
  - include thoughtsTokenCount in token accounting: prompt_tokens now includes reasoning tokens and reasoning_tokens are reported under completion_tokens_details.reasoning_tokens when present
- Update comments, docstrings, and conversion logic to reflect Gemini 3 behavior
- Rotate Antigravity OAuth client secret in AntigravityAuthBase
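The content/reasoning split described above can be sketched roughly as follows. This is a minimal illustration, not the provider's actual code: the function name `split_parts` is hypothetical, and it assumes each part is a dict with optional `text`, `thought`, and `thoughtSignature` keys.

```python
from typing import Any, Dict, List, Tuple


def split_parts(parts: List[Dict[str, Any]]) -> Tuple[str, str]:
    """Return (content, reasoning_content) from Gemini-style response parts.

    Hypothetical helper: parts flagged thought=True go to reasoning_content,
    and thoughtSignature values are never copied into either string.
    """
    content: List[str] = []
    reasoning: List[str] = []
    for part in parts:
        text = part.get("text")
        if text is None:
            # Signature-only (or other non-text) parts have nothing to show.
            continue
        if part.get("thought"):
            reasoning.append(text)
        else:
            content.append(text)
    return "".join(content), "".join(reasoning)
```

A caller would then place the first value in `content` and the second in `reasoning_content` of the OpenAI-style message.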
…token counting
Add a per-request file logger and reasoning configuration mapping to the Antigravity provider and expose a token counting helper.
- Introduce _AntigravityFileLogger to persist request payloads, streaming chunks, errors, and final responses under logs/antigravity_logs with timestamped directories.
- Add optional enable_request_logging kwarg to completion flow to enable per-call file logging; wire logger through streaming and non-streaming handlers.
- Log request payloads, raw response chunks, parse errors, and final unwrapped responses when enabled.
- Add _map_reasoning_effort_to_thinking_config to map reasoning_effort ('low'|'medium'|'high'|'disable'|None) to Gemini thinkingConfig for gemini-2.5 and gemini-3 families (budgets/levels and include_thoughts).
- Add count_tokens method that calls Antigravity :countTokens endpoint using transformed Gemini payloads and returns prompt/total token counts.
- Add cautionary comment about Claude parametersJsonSchema handling requiring investigation.
No behavioral breaking changes; new logging is opt-in via enable_request_logging and token counting is additive.
…budget toggle
Introduce a consolidated mapping for reasoning effort targeted at Gemini 2.5 and Gemini 3 models:
- Replace older duplicated logic with a single _map_reasoning_effort_to_thinking_config that detects gemini-2.5 vs gemini-3.
- Gemini 2.5: map reasoning_effort to model-specific thinkingBudget values (pro/flash/fallback). Default auto = -1. Apply division by 4 unless kwargs['custom_reasoning_budget'] is True.
- Gemini 3: use string thinkingLevel ("low" or "high"), default to "high" when unspecified and do not allow disabling thinking.
- Return None for non-Gemini models to avoid changing other providers (e.g., Claude).
- Propagate a new custom_reasoning_budget toggle from kwargs to the mapping call.
- Add threading and os imports and remove the old obsolete mapping implementation.
BREAKING CHANGE: Gemini 3 thinkingConfig format and defaults changed:
- thinkingLevel is now a string ("low"/"high") instead of numeric levels. Update any code that inspects thinkingConfig thinkingLevel.
- Default thinking behavior for Gemini 3 is now "high" when reasoning_effort is omitted.
- The mapping function signature/behavior changed (added custom_reasoning_budget handling). If this method was called externally, update callers to pass the new parameter or rely on kwargs propagation.
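As a rough sketch of the mapping rules in this commit message: the function name, the budget numbers, the treatment of "medium" on Gemini 3, and the use of a zero budget for "disable" are all assumptions made for illustration. Only the -1 auto default, the division by 4, the string `thinkingLevel`, and the `None` return for other models come from the text above.

```python
from typing import Any, Dict, Optional

# Illustrative budgets only; the real per-model values are not stated here.
ASSUMED_BUDGETS = {"low": 8192, "medium": 16384, "high": 32768}


def map_reasoning_effort(
    model: str,
    effort: Optional[str],
    custom_reasoning_budget: bool = False,
) -> Optional[Dict[str, Any]]:
    """Map reasoning_effort to a thinkingConfig-like dict (hypothetical sketch)."""
    if "gemini-3" in model:
        # Gemini 3: string level, defaults to "high", thinking cannot be disabled.
        # Assumption: anything other than "low" maps to "high".
        level = "low" if effort == "low" else "high"
        return {"thinkingLevel": level, "include_thoughts": True}
    if "gemini-2.5" in model:
        if effort is None:
            return {"thinkingBudget": -1, "include_thoughts": True}  # auto
        if effort == "disable":
            # Assumption: a zero budget is how thinking is switched off.
            return {"thinkingBudget": 0, "include_thoughts": False}
        budget = ASSUMED_BUDGETS.get(effort, -1)
        if budget > 0 and not custom_reasoning_budget:
            budget //= 4  # default reduction unless the caller opts out
        return {"thinkingBudget": budget, "include_thoughts": True}
    return None  # other providers (e.g. Claude at this commit) are untouched
```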
…e thoughtSignature handling for Gemini 3
- Introduce ThoughtSignatureCache: TTL-based, thread-safe, auto-cleanup cache for mapping tool_call_id → thoughtSignature.
- Integrate cache into AntigravityProvider and add env toggles:
- ANTIGRAVITY_SIGNATURE_CACHE_TTL (default 3600s)
- ANTIGRAVITY_PRESERVE_THOUGHT_SIGNATURES (client passthrough)
- ANTIGRAVITY_ENABLE_SIGNATURE_CACHE (server-side caching)
- Update message transformation to accept model and implement a 3-tier thoughtSignature fallback:
1. client-provided signature
2. server-side cache
3. bypass constant ("skip_thought_signature_validator") with warning for Gemini 3
- Fix Gemini → OpenAI chunk conversion:
- Stop dropping function calls that include signatures (skip only standalone signature parts).
- Store signatures into server cache and optionally include them in responses when passthrough is enabled.
- Robustly parse tool responses, map finish reasons, and include reasoning token counts in usage.
- Improve tool response grouping and id generation; add informative logging for signature-preservation behavior
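The 3-tier thoughtSignature fallback listed above could look something like this. It is a sketch under assumptions: `resolve_signature` is a hypothetical name, and a plain dict stands in for the TTL-based `ThoughtSignatureCache`.

```python
import logging
from typing import Dict, Optional

# Bypass constant named in the commit message.
BYPASS_SIGNATURE = "skip_thought_signature_validator"
log = logging.getLogger("antigravity_sketch")


def resolve_signature(
    tool_call_id: str,
    client_signature: Optional[str],
    cache: Dict[str, str],
) -> str:
    """Pick a thoughtSignature for a Gemini 3 tool call (hypothetical helper)."""
    if client_signature:
        return client_signature  # tier 1: client passthrough
    cached = cache.get(tool_call_id)
    if cached:
        return cached  # tier 2: server-side cache
    # Tier 3: bypass constant, with a warning so the gap is visible in logs.
    log.warning("No thoughtSignature for tool call %s; using bypass", tool_call_id)
    return BYPASS_SIGNATURE
```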
…tSignature and decouple cache/passthrough
Enforce Gemini 3 behavior where only the first tool call in parallel receives a thoughtSignature. Previously caching and client passthrough were coupled and could result in multiple signatures being stored or passed. This change:
- add a first_signature_seen flag to ensure only the first tool call gets the signature
- store signature in server-side cache only when _enable_signature_cache is true
- pass signature to the client only when _preserve_signatures_in_client is true
- preserve logging when a signature is stored in cache
…y aliasing Add "claude-sonnet-4-5" and "claude-sonnet-4-5-thinking" to HARDCODED_MODELS and simplify the alias mappings by removing explicit alias entries for these Claude models since their public names match internal names. This ensures the provider recognizes the new Claude Sonnet variants and avoids incorrect alias translations.
- Add providers/google_oauth_base.py to centralize Google OAuth logic (auth flow, token refresh, env loading, atomic saves, backoff/retry, queueing, headless support, and validation).
- Migrate GeminiAuthBase and AntigravityAuthBase to inherit from GoogleOAuthBase and expose provider-specific constants (CLIENT_ID, CLIENT_SECRET, OAUTH_SCOPES, ENV_PREFIX, CALLBACK_PORT, CALLBACK_PATH).
- Register "antigravity" in DEFAULT_OAUTH_DIRS and mark it as OAuth-only in credential_tool; include a user-friendly display name for interactive flows.
- Remove large duplicated OAuth implementations from provider-specific files and consolidate behavior to reduce maintenance surface and ensure consistent token handling.
…_token helper
Add opt-in dynamic model discovery controlled by ANTIGRAVITY_ENABLE_DYNAMIC_MODELS (default: false) to avoid relying on an unstable endpoint. When disabled, the provider returns the hardcoded model list; when enabled, it attempts to fetch models from the API and applies alias mappings. Add clear logging for enabled/disabled states and dynamic discovery results. Also introduce an async get_valid_token helper that loads credentials, refreshes expired tokens, and returns a valid access token for OAuth-style credential paths.
- New env var: ANTIGRAVITY_ENABLE_DYNAMIC_MODELS (false by default)
- Dynamic discovery returns discovered models prefixed with "antigravity/"
- Hardcoded fallback now returns names prefixed with "antigravity/"
- Added logs to indicate discovery mode and failures
- Added async get_valid_token(credential_identifier) to centralize token refresh/load
BREAKING CHANGE: Model names returned by the provider are now namespaced with the "antigravity/" prefix (e.g., "antigravity/xyz"). Update consumers to handle the new prefixed names or strip the prefix as needed. Dynamic discovery is disabled by default; enable it with ANTIGRAVITY_ENABLE_DYNAMIC_MODELS=true if desired.
…edential save - Handle system prompt content as either string or list and strip Claude-specific cache_control fields to avoid 400 errors - Safely parse tool content (JSON or raw) and wrap function responses consistently - Treat merged function response role as "user" to match Antigravity expectations - Add tool_call index for OpenAI streaming format and track index for parallel tool calls - Strip provider prefix from model names and add streaming query param (?alt=sse) when streaming - Include Host and User-Agent headers, set Accept based on streaming, and log error response bodies for easier debugging - Convert OpenAI-style chunks into litellm.ModelResponse objects before yielding in stream handler - Make credential persistence in Gemini CLI provider async (await _save_credentials)
…nd strip unsupported fields Remove dependency on _build_vertex_schema and align tool handling with the Go reference implementation. For function-type tools, build a function declaration with name, description, and a parametersJsonSchema field: - copy parameters when present and remove OpenAI-specific keys (`$schema`, `strict`); - default to an empty object schema when parameters are missing; - avoid mutating the original parameters and embed the declaration in `functionDeclarations`. This ensures Antigravity-compatible tool payloads and fixes schema/compatibility issues when passing tool definitions.
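A minimal sketch of the function-declaration step described above, assuming OpenAI-style tool dicts. The helper name `build_function_declaration` and the exact shape of the empty default schema are assumptions, not taken from the provider source.

```python
import copy
from typing import Any, Dict


def build_function_declaration(tool: Dict[str, Any]) -> Dict[str, Any]:
    """Convert one OpenAI function tool into a declaration dict (hypothetical)."""
    fn = tool["function"]
    # Deep-copy so the caller's tool definition is never mutated.
    params = copy.deepcopy(fn.get("parameters")) or {"type": "object", "properties": {}}
    for key in ("$schema", "strict"):  # OpenAI-specific keys named in the commit
        params.pop(key, None)
    return {
        "name": fn["name"],
        "description": fn.get("description", ""),
        "parametersJsonSchema": params,
    }


def build_tools_payload(tools: list) -> list:
    """Embed declarations for function-type tools under functionDeclarations."""
    declarations = [
        build_function_declaration(t) for t in tools if t.get("type") == "function"
    ]
    return [{"functionDeclarations": declarations}] if declarations else []
```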
…mas, and fix Gemini tool conversion - Rename _normalize_json_schema → _normalize_type_arrays and convert JSON Schema "type" arrays (e.g. ["string","null"]) to a single non-null type to avoid protobuf "non-repeating" errors. - Add recursive Claude-specific schema cleaner and rename parametersJsonSchema → parameters for claude-sonnet-* models, stripping incompatible fields that break Claude validation. - Ensure thoughtSignature preservation logic remains with proper first-seen handling. - Inline generation of project/request IDs when fetching models. - Replace Vertex helper usage when building Gemini tool declarations: copy/clean parameters, set a safe default parametersJsonSchema, and call _normalize_type_arrays for compatibility.
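To illustrate the type-array conversion mentioned above, here is a small recursive sketch. It is an assumption about how such a normalizer might work (picking the first non-null entry), not a copy of `_normalize_type_arrays`.

```python
from typing import Any


def normalize_type_arrays(schema: Any) -> Any:
    """Collapse JSON Schema "type" arrays to a single non-null type.

    Hypothetical sketch: returns a new structure and leaves the input intact.
    """
    if isinstance(schema, list):
        return [normalize_type_arrays(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    result = {}
    for key, value in schema.items():
        if key == "type" and isinstance(value, list):
            non_null = [t for t in value if t != "null"]
            # Assumption: keep the first non-null type; fall back to "null".
            result[key] = non_null[0] if non_null else "null"
        else:
            result[key] = normalize_type_arrays(value)
    return result
```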
…ignature handling to gemini-3 Add "id" to functionCall and response objects required by Antigravity/Claude integrations. Restrict preservation/insertion of thoughtSignature to Gemini 3 models only: prefer client-provided signature, fall back to the server-side cache when enabled, and finally use the bypass constant "skip_thought_signature_validator". Emit a warning when a Gemini 3 tool call lacks a signature. Avoid adding thoughtSignature for Claude and other models to prevent sending unsupported fields.
Add an environment-controlled override that modifies requests with `temperature: 0` for chat completions when `OVERRIDE_TEMPERATURE_ZERO` is enabled (default: "false").
- Supported modes: "remove" deletes the `temperature` key; "set"/"true"/"1"/"yes" sets temperature to 1.0.
- Rationale: temperature=0 makes models overly deterministic and can cause tool hallucination; the override helps mitigate that when toggled.
- Emits debug logs when an override is applied.
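A minimal sketch of how the override modes could be applied, assuming the request is a plain dict. The function name and the optional `mode` parameter (used here so the behavior can be exercised without touching the environment) are illustrative additions.

```python
import logging
import os
from typing import Any, Dict, Optional

log = logging.getLogger("override_sketch")


def apply_temperature_override(
    payload: Dict[str, Any], mode: Optional[str] = None
) -> Dict[str, Any]:
    """Return a copy of payload with temperature=0 overridden (hypothetical)."""
    if mode is None:
        mode = os.getenv("OVERRIDE_TEMPERATURE_ZERO", "false")
    mode = mode.lower()
    if payload.get("temperature") != 0:
        return payload  # only an explicit zero temperature is affected
    result = dict(payload)
    if mode == "remove":
        result.pop("temperature")
        log.debug("Removed temperature=0 from request")
    elif mode in {"set", "true", "1", "yes"}:
        result["temperature"] = 1.0
        log.debug("Replaced temperature=0 with 1.0")
    return result
```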
…tem-instruction) to reduce tool hallucination Introduce a configurable "Gemini 3" catch-all fix that enforces schema-driven tool usage and reduces tool hallucination by: - adding env-configurable flag ANTIGRAVITY_GEMINI3_TOOL_FIX (default ON) and related vars for prefix, description prompt, and system instruction - implementing namespace prefixing for tool names to break model training associations - injecting strict parameter signatures into tool descriptions to force schema adherence - prepending configurable system instructions for Gemini-3 models to override training-data assumptions - normalizing request/response names (prefix/strip) and preserving function call ids for API consistency - applying transformations only for gemini-3-* models and logging configuration details This change improves robustness when calling external tools by making tool schemas explicit to the model.
Implement dual-TTL caching system with async disk persistence to improve thoughtSignature handling across server restarts and long-running sessions.
- Add disk persistence using atomic file writes with tempfile pattern for data integrity
- Implement dual-TTL system: 1-hour memory cache, 24-hour disk cache
- Create background async tasks for periodic disk writes and memory cleanup
- Add disk fallback mechanism for cache misses (loads from disk into memory)
- Introduce cache statistics tracking (memory hits, disk hits, misses, writes)
- Add graceful shutdown with pending write flush
- Convert cache operations from threading.Lock to asyncio.Lock for async support
- Add environment variables for configurable write/cleanup intervals
- Implement secure file permissions (0o600) for cache files
- Add comprehensive logging for cache lifecycle events
The cache now survives server restarts and provides better support for multi-turn conversations by persisting thoughtSignatures to disk. Memory cache expires after 1 hour to prevent unbounded growth, while disk cache persists for 24 hours to support longer conversation sessions.
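The "atomic file writes with tempfile pattern" and the 0o600 permissions mentioned above are commonly implemented along these lines. This is a generic sketch of the pattern using only the standard library; the function name is hypothetical and the real cache may differ in detail.

```python
import json
import os
import tempfile
from pathlib import Path


def atomic_write_json(path: Path, data: dict) -> None:
    """Write data as JSON so readers never observe a half-written file."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # Create the temp file in the target directory so os.replace stays on
    # one filesystem, which is what makes the final rename atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump(data, handle)
        os.chmod(tmp_name, 0o600)  # owner read/write only
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise
```

Because the rename either happens fully or not at all, a crash during a periodic disk write leaves the previous cache file in place.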
… in tool args
- Extend reasoning/thinking mapping to include Claude alongside Gemini 2.5 and Gemini 3:
- Claude now uses `thinkingBudget` (same handling as Gemini 2.5, including pro budgets).
- Gemini 3 continues to use `thinkingLevel`.
- Add a static helper `_recursively_parse_json_strings` to detect and parse JSON-stringified values returned by Antigravity (e.g., `{"files": "[{...}]"}`) and recursively restore proper structures.
- Use parsed arguments before `json.dumps()` when building tool call payloads to prevent double-encoding and JSON parsing errors from Antigravity responses.
- Update .gitignore to add `launcher_config.json` and `cache/antigravity/thought_signatures.json` and remove the previous `*.log` ignore entry.
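The recursive parsing of JSON-stringified values described above might be sketched as follows. The function below is an illustrative stand-in for `_recursively_parse_json_strings`, assuming that only strings starting with `{` or `[` should be treated as candidate JSON.

```python
import json
from typing import Any


def recursively_parse_json_strings(value: Any) -> Any:
    """Restore structures that arrived as JSON-encoded strings (sketch)."""
    if isinstance(value, dict):
        return {k: recursively_parse_json_strings(v) for k, v in value.items()}
    if isinstance(value, list):
        return [recursively_parse_json_strings(v) for v in value]
    if isinstance(value, str):
        stripped = value.strip()
        if stripped[:1] in ("{", "["):
            try:
                parsed = json.loads(stripped)
            except json.JSONDecodeError:
                return value  # not valid JSON after all; keep the raw string
            return recursively_parse_json_strings(parsed)
    return value
```

Calling this before `json.dumps()` means nested values are serialized once, as structures, instead of being encoded a second time as strings.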
…ravity cache handling
- Split the single signature cache into separate files: `GEMINI3_SIGNATURE_CACHE_FILE` and `CLAUDE_THINKING_CACHE_FILE`.
- Replace `ThoughtSignatureCache` with `AntigravityCache`; disk persistence file is now passed via a `cache_file` constructor argument and in-memory entries are keyed by generic cache keys.
- Introduce a stable key generator (`_generate_thinking_cache_key`) that combines tool call IDs and text hashes for Claude thinking caching.
- Add separate caches for Gemini 3 signatures (`_signature_cache`) and Claude thinking content (`_thinking_cache`), and wire caching into both streaming and non-streaming flows.
- Accumulate reasoning content, tool calls, and the final `thoughtSignature` during streaming (via `stream_accumulator`) and persist complete Claude thinking after the stream (`_cache_claude_thinking_after_stream`).
- Inject cached Claude "thinking" parts into assistant messages when available (with signature fallback handling).
- Use tool-provided IDs when present (fall back to generated `call_<uuid>` IDs), fix skipping logic for signature-only parts, and accumulate tool calls/text for reliable cache keys.
- Adjust reasoning budget division from `// 4` to `// 6` to reduce default thinking budget.
- Update `_gemini_to_openai_chunk` signature to accept an optional `stream_accumulator` and propagate accumulator through streaming logic.
BREAKING CHANGE: `ThoughtSignatureCache` has been removed/renamed to `AntigravityCache` and its constructor now requires a `cache_file: Path` argument. Update any external imports/usages:
- Replace `ThoughtSignatureCache(...)` with `AntigravityCache(cache_file=GEMINI3_SIGNATURE_CACHE_FILE|CLAUDE_THINKING_CACHE_FILE, memory_ttl_seconds=..., disk_ttl_seconds=...)`.
- New cache constants `GEMINI3_SIGNATURE_CACHE_FILE` and `CLAUDE_THINKING_CACHE_FILE` were added; ensure integrations use the new names if relying on disk cache paths.
… tier-based onboarding This commit refactors the project discovery logic to strictly follow the official Gemini CLI behavior, fixing critical issues with paid tier support and free tier onboarding. Key changes: - Implement proper discovery flow: cache → configured override → persisted credentials → loadCodeAssist check → tier-based onboarding → fallback - Fix paid tier support: paid tiers now correctly use configured project_id instead of server-managed projects - Fix free tier onboarding: free tier correctly passes cloudaicompanionProject=None for server-managed projects - Add comprehensive tier detection logic: check currentTier from server response and respect userDefinedCloudaicompanionProject flag - Improve error handling: add specific error messages for 412 (precondition failed) and better guidance for missing project_id on paid tiers - Add detailed debug logging: log all tier information, server responses, and decision flow for troubleshooting - Add paid tier visibility: log paid tier usage on each request for transparency - Remove noisy debug logging: disable verbose chunk conversion logs The previous implementation incorrectly assumed all users should use server-managed projects and failed to properly distinguish between free tier (server-managed) and paid tier (user-provided) project handling. This caused 403/412 errors for paid users and incorrect onboarding flow for free users.
… organization and documentation This is a major refactoring of the Antigravity provider implementation that significantly improves code structure, readability, and maintainability without changing functionality. Key improvements: - Reorganized code into logical sections with clear separators (configuration, utilities, caching, transformations, API interface) - Consolidated helper functions with consistent naming patterns (underscore prefix for internal methods) - Simplified complex methods by extracting reusable components (e.g., _parse_content_parts, _extract_tool_call, _format_type_hint) - Enhanced documentation with comprehensive module docstring explaining features and capabilities - Streamlined environment variable handling with dedicated helper functions (_env_bool, _env_int) - Improved type hints and method signatures for better IDE support - Reduced code duplication in message transformation logic - Consolidated tool schema transformations into focused methods - Better separation of concerns between streaming and non-streaming response handling - Standardized error handling and logging patterns - Improved cache implementation with clearer separation of responsibilities The refactoring maintains full backward compatibility while making the codebase significantly easier to understand, test, and extend. All existing features including Gemini 3 thoughtSignature preservation, Claude thinking caching, tool hallucination prevention, and base URL fallback remain fully functional.
…module Extracted the AntigravityCache class into a new shared ProviderCache module to eliminate code duplication and improve maintainability across providers. - Created src/rotator_library/providers/provider_cache.py with generic, reusable cache implementation - Removed 266 lines of cache-specific code from antigravity_provider.py - Updated AntigravityProvider to use ProviderCache for both signature and thinking caches - Added configurable env_prefix parameter for flexible environment variable namespacing - Improved cache naming with _cache_name for better logging context - Added convenience factory function create_provider_cache() for streamlined cache creation - Removed unused imports (shutil, tempfile) from antigravity_provider.py - Updated .gitignore to include cache/ directory The new ProviderCache maintains full backward compatibility with the previous AntigravityCache implementation while providing a more modular, reusable foundation for other providers.
…automatic -thinking mapping This commit streamlines the handling of Claude Sonnet 4.5 model variants by automatically mapping the base model to its -thinking variant when reasoning_effort is provided. - Remove explicit "claude-sonnet-4-5-thinking" from AVAILABLE_MODELS list - Add inline documentation explaining internal mapping behavior - Implement automatic model variant selection in _transform_to_antigravity_format based on reasoning_effort parameter - Thread reasoning_effort parameter through generate_content call chain - Check for base claude-sonnet-4-5 model and append "-thinking" suffix when reasoning_effort is present This improves the API surface by reducing redundant model options while maintaining full functionality through intelligent runtime model selection.
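The runtime variant selection described above reduces to a small check. In this sketch the function name is hypothetical, and treating any truthy `reasoning_effort` as "present" is an assumption; the commit text does not say how a value such as "disable" is handled.

```python
from typing import Optional

BASE_MODEL = "claude-sonnet-4-5"


def select_model_variant(model: str, reasoning_effort: Optional[str] = None) -> str:
    """Append "-thinking" to the base Claude model when reasoning is requested."""
    if model == BASE_MODEL and reasoning_effort:
        return f"{model}-thinking"
    return model
```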
…ure caching This commit integrates comprehensive support for `gemini-3-pro-preview`, addressing specific requirements for reasoning models and tool reliability. - Update `AntigravityProvider` and `GeminiCliProvider` model lists to prioritize Gemini 3. - Implement a "Tool Fix" mechanism to prevent parameter hallucinations: - Inject strict parameter signatures and type hints into tool descriptions. - Add specific system instructions to enforce schema adherence. - Apply `gemini3_` namespace prefixing to isolate tool contexts. - Integrate `ProviderCache` to persist `thoughtSignature` values, ensuring reasoning continuity during tool execution. - Refactor `_handle_reasoning_parameters` to support Gemini 3's `thinkingLevel` (string) alongside Gemini 2.5's `thinkingBudget` (integer). - Add environment variable configuration for cache TTL and feature flags.
…quest payload The `model` and `project` parameters were being incorrectly included at the top level of the request payload. These fields are not part of the Gemini API request body structure and should only be used for endpoint construction or authentication context.
…g for Antigravity - Change reasoning parameters log from info to debug level in main.py - Move reasoning parameters logging outside logger conditional block for consistent monitoring - Enhance _clean_claude_schema documentation to clarify it's for Antigravity/Google's Proto-based API - Add support for converting 'const' to 'enum' with single value in schema cleaning - Improve code organization with better comments explaining unsupported fields These changes improve logging granularity and enhance JSON Schema compatibility with Antigravity's Proto-based API requirements.
…model switches This commit introduces intelligent handling of Claude's thinking mode when switching models mid-conversation during incomplete tool use loops. **New Features:** - Auto-detection of incomplete tool turns (when messages end with tool results without assistant completion) - Configurable turn completion injection via `ANTIGRAVITY_AUTO_INJECT_TURN_COMPLETION` (default: true) - Configurable thinking mode suppression via `ANTIGRAVITY_AUTO_SUPPRESS_THINKING` (default: false) - Customizable turn completion placeholder text via `ANTIGRAVITY_TURN_COMPLETION_TEXT` (default: "...") **Implementation Details:** - `_detect_incomplete_tool_turn()`: Analyzes message history to identify incomplete tool use patterns - `_inject_turn_completion()`: Appends a synthetic assistant message to close incomplete turns - `_handle_thinking_mode_toggle()`: Orchestrates the toggling strategy based on configuration **Behavior:** When switching to Claude with thinking mode enabled during an incomplete tool loop: 1. If auto-injection is enabled: Inject a completion message to allow thinking mode 2. If auto-suppression is enabled: Disable thinking mode to prevent API errors 3. If both disabled: Allow the request to proceed (likely resulting in API error) This resolves API compatibility issues when transitioning between models with different conversation state requirements.
The generic key handling logic was incorrectly concatenating the 'role' field when processing streaming message chunks. The role field should always be replaced with the latest value, not concatenated like content fields. This fix adds an explicit check to ensure the 'role' key is always overwritten rather than appended to, preventing malformed role values in the final message object.
Antigravity sometimes returns malformed JSON strings with extra trailing characters (e.g., '[{...}]}' instead of '[{...}]'). This enhancement extends the JSON parsing logic to automatically detect and correct such malformations by:
- Detecting JSON-like strings that don't have proper closing delimiters
- Finding the last valid closing bracket/brace and truncating extra characters
- Logging warnings when auto-correction is applied for debugging purposes
- Recursively parsing the corrected JSON structures
This prevents parsing failures when Antigravity returns double-encoded or malformed JSON in tool arguments.
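One way to implement the truncation strategy above is to retry parsing at each earlier closing delimiter until a valid document is found. This is a sketch under that assumption; the helper name is hypothetical and the provider's actual detection rules may be stricter.

```python
import json
import logging
from typing import Any

log = logging.getLogger("json_repair_sketch")


def parse_with_trailing_repair(text: str) -> Any:
    """Parse JSON-like text, dropping extra trailing characters if needed.

    Returns the original string unchanged when no valid prefix exists.
    """
    stripped = text.strip()
    if not stripped or stripped[0] not in "[{":
        return text
    closer = "]" if stripped[0] == "[" else "}"
    end = stripped.rfind(closer)
    while end != -1:
        candidate = stripped[: end + 1]
        try:
            parsed = json.loads(candidate)
        except json.JSONDecodeError:
            # Try the next closing delimiter further to the left.
            end = stripped.rfind(closer, 0, end)
            continue
        if len(candidate) != len(stripped):
            log.warning("Auto-corrected malformed JSON: dropped %r", stripped[end + 1:])
        return parsed
    return text
```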
…dentials The `_get_provider_instance` method now checks if credentials exist for a provider before attempting initialization. This prevents potential errors from initializing providers that lack proper configuration. - Added credential existence check at the start of the method - Returns `None` early if provider credentials are not configured - Added debug logging to indicate when provider initialization is skipped - Enhanced docstring with detailed Args and Returns documentation This change improves system robustness by failing gracefully when providers are referenced but not properly configured.
This commit removes the thinking mode toggling functionality that was previously used to handle model switches mid-conversation when tool use loops were incomplete. - Removed `_detect_incomplete_tool_turn`, `_inject_turn_completion`, and `_handle_thinking_mode_toggle` helper methods - Removed environment variable configuration for turn completion behavior (`ANTIGRAVITY_AUTO_INJECT_TURN_COMPLETION`, `ANTIGRAVITY_AUTO_SUPPRESS_THINKING`, `ANTIGRAVITY_TURN_COMPLETION_TEXT`) - Removed thinking mode toggle logic from `acompletion` method - Added provider prefix to JSON auto-correction warning log for better debugging The removed feature was designed to automatically handle incomplete tool use loops when switching to Claude models with thinking mode enabled, but was buggy as hell.
… failures
This commit improves the robustness of OAuth token refresh operations in both IFlowAuthBase and QwenAuthBase by implementing failure tracking with exponential backoff and credential validation.
- Track refresh failures per credential path using `_refresh_failures` dictionary
- Implement exponential backoff (30s * 2^failures, max 5 minutes) to prevent rapid retry loops on persistent failures
- Clear backoff state on successful authentication or refresh
- Add validation to ensure refreshed credentials contain required fields (access_token, refresh_token, and api_key for iFlow)
- Update proactively_refresh to support env:// virtual paths for environment-based OAuth credentials
- Add detailed debug logging for backoff timer settings
The backoff mechanism prevents excessive API calls when refresh tokens are invalid or services are temporarily unavailable, while the validation ensures credential integrity after refresh operations.
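The backoff rule quoted above (30s * 2^failures, capped at 5 minutes) can be sketched with a small tracker. The class and method names are hypothetical, and using the failure count before incrementing (so the first failure waits 30 seconds) is an assumption about how the exponent is counted.

```python
from typing import Dict


class RefreshBackoff:
    """Per-credential refresh backoff tracker (illustrative sketch)."""

    def __init__(self, base: float = 30.0, cap: float = 300.0) -> None:
        self.base = base
        self.cap = cap
        self._refresh_failures: Dict[str, int] = {}
        self._next_attempt: Dict[str, float] = {}

    def record_failure(self, path: str, now: float) -> float:
        """Register a failed refresh and return the delay before the next try."""
        failures = self._refresh_failures.get(path, 0)
        delay = min(self.base * (2 ** failures), self.cap)
        self._refresh_failures[path] = failures + 1
        self._next_attempt[path] = now + delay
        return delay

    def record_success(self, path: str) -> None:
        """Clear backoff state after a successful authentication or refresh."""
        self._refresh_failures.pop(path, None)
        self._next_attempt.pop(path, None)

    def can_attempt(self, path: str, now: float) -> bool:
        return now >= self._next_attempt.get(path, 0.0)
```

Passing `now` explicitly (for example from `time.monotonic()`) keeps the tracker easy to test.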
…alization in stream reassembly
This commit addresses critical issues in the streaming response reassembly logic across multiple providers (Gemini CLI, iFlow, and Qwen Code):
- Implements priority-based finish_reason determination: tool_calls > chunk's finish_reason (length, content_filter, etc.) > stop
- Properly initializes aggregated_tool_calls with "type": "function" field for OpenAI compatibility
- Tracks chunk_finish_reason separately to preserve provider-specific finish reasons (e.g., content_filter, length limits)
- Uses safer .get("index", 0) for tool call index extraction to prevent KeyErrors
- Adds explicit type field handling during tool call aggregation
- Improves docstring documentation explaining the reassembly logic
- Moves copy import to top-level in iflow_provider.py and qwen_code_provider.py for consistency
CRITICAL FIX for qwen_code_provider.py: Handles chunks with BOTH usage and choices data (typical for final chunk) without early return, ensuring finish_reason is properly captured before yielding usage data separately.
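The reassembly rules above can be illustrated with two small helpers. Both names are hypothetical, and the delta shape assumed here is the OpenAI streaming `tool_calls` format; the sketch shows the priority order and the `.get("index", 0)` and `"type": "function"` handling, not the providers' actual code.

```python
from typing import Any, Dict, List, Optional


def resolve_finish_reason(has_tool_calls: bool, chunk_finish_reason: Optional[str]) -> str:
    """Priority: tool_calls > provider-reported reason (length, content_filter, ...) > stop."""
    if has_tool_calls:
        return "tool_calls"
    return chunk_finish_reason or "stop"


def aggregate_tool_calls(deltas: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Merge streamed tool-call fragments into complete tool calls."""
    calls: Dict[int, Dict[str, Any]] = {}
    for delta in deltas:
        index = delta.get("index", 0)  # tolerate chunks without an index
        call = calls.setdefault(
            index,
            {"id": None, "type": "function", "function": {"name": "", "arguments": ""}},
        )
        if delta.get("id"):
            call["id"] = delta["id"]
        if delta.get("type"):
            call["type"] = delta["type"]
        fn = delta.get("function") or {}
        if fn.get("name"):
            call["function"]["name"] = fn["name"]
        # Arguments arrive as string fragments and are concatenated in order.
        call["function"]["arguments"] += fn.get("arguments") or ""
    return [calls[i] for i in sorted(calls)]
```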
The .env file was being loaded after attempting to read PROXY_API_KEY from environment variables, causing the key to be unavailable for display during startup. Moving the dotenv.load_dotenv() call earlier in the initialization sequence ensures environment variables are loaded before they are accessed.
Introduces a comprehensive provider-specific settings management system for Antigravity and Gemini CLI providers with detection, display, and interactive configuration capabilities. - Add `PROVIDER_SETTINGS_MAP` with detailed definitions for Antigravity (12 settings) and Gemini CLI (8 settings) including signature caching, tool fixes, and provider-specific parameters - Implement `ProviderSettingsManager` class for managing provider settings with type-aware value parsing and modification tracking - Add `detect_provider_settings()` method to `SettingsDetector` to identify modified provider settings from environment variables - Integrate provider settings detection into launcher TUI summary display and detailed advanced settings view - Add new menu option (4) in settings tool for provider-specific configuration management - Implement interactive TUI for browsing, editing, and resetting individual or all provider settings with visual indication of modified values - Display provider settings status in launcher with count of modified settings per provider - Support bool, int, and string setting types with appropriate input handling and validation
Restructured the Antigravity provider description in the README for better clarity and readability: - Converted the dense paragraph into a structured bullet list highlighting key features - Separated thought signature caching, tool hallucination prevention, and thinking block sanitization into distinct points - Replaced the informal troubleshooting note with a concise reference to dedicated documentation - Added direct link to Antigravity documentation section for Claude extended thinking sanitization details This change improves the discoverability of Antigravity's advanced features and provides a clearer path for users to understand Claude Sonnet 4.5 thinking mode limitations.
Important
Looks good to me! 👍
Reviewed efbd008 in 1 minute and 36 seconds.
- Reviewed 17 lines of code in 1 file
- Skipped 0 files when reviewing.
- Skipped posting 1 draft comment. View those below.
1. README.md:31
- Draft comment: Refined the Antigravity Provider description: the bullet list clearly outlines advanced features and removes informal language. Verify that the linked documentation covers all details on Claude Sonnet 4.5 state management.
- Reason this comment was not posted: Confidence changes required: 0% <= threshold 1%
…limit
- Add `*.env` to `.gitignore` to prevent accidentally committing environment variables containing sensitive data
- Increase `DEFAULT_MAX_OUTPUT_TOKENS` from 16384 to 32384 in Antigravity provider to allow for longer model outputs
…ding and bulk export tools This commit introduces comprehensive support for loading OAuth credentials from environment variables alongside file-based credentials, and adds powerful bulk export/combine functionality for all credential types. Main changes: - **Environment-based credentials**: Modified main.py to load all *.env files from the root directory, enabling credentials to be stored in environment variables with an "env://" virtual path scheme - **Safe metadata handling**: Added checks throughout to skip file I/O operations for env-based credentials (they use virtual paths and don't have metadata files) - **Optimized credential discovery**: Updated RotatingClient to accept pre-discovered credentials from main.py, avoiding redundant discovery calls - **Bulk export tools**: Added `export_all_provider_credentials()` to export all credentials for a specific provider to individual .env files - **Credential combining**: Added `combine_provider_credentials()` to merge all credentials for a provider into a single .env file, and `combine_all_credentials()` to create one master .env file with all providers - **Enhanced export menu**: Expanded the credential export submenu with 13 options covering individual exports, bulk exports per provider, and various combining strategies - **Provider support**: Added helper functions `_build_gemini_cli_env_lines()`, `_build_qwen_code_env_lines()`, `_build_iflow_env_lines()`, and `_build_antigravity_env_lines()` for consistent .env file generation These changes enable flexible credential management, allowing users to store credentials as files or environment variables, and providing powerful tools to export and combine credentials for deployment scenarios.
Caution
Changes requested ❌
Reviewed bd8f638 in 2 minutes and 56 seconds.
- Reviewed 473 lines of code in 3 files
- Skipped 0 files when reviewing
- Skipped posting 7 draft comments, listed below
1. src/rotator_library/client.py:66
- Draft comment: Rotation tolerance parameter set to 3.0 appears appropriate. Ensure that documentation (e.g. README) clearly explains recommended ranges for production use.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
2. src/rotator_library/client.py:752
- Draft comment: Credential prioritization logic using get_model_tier_requirement and get_credential_priority is well integrated. Consider clarifying (in comments or docs) the behavior when a credential's priority is unknown (None).
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
3. src/rotator_library/client.py:1195
- Draft comment: The streaming completion retry logic similarly applies model tier filtering. Consider refactoring common filtering logic between streaming and non-streaming paths to reduce code duplication.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
4. src/rotator_library/credential_tool.py:50
- Draft comment: The helper _build_env_export_content is well structured for generating .env lines. It may help to sanitize values (e.g. email addresses) to avoid issues with special characters.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
5. src/rotator_library/credential_tool.py:110
- Draft comment: In ensure_env_defaults, a default PROXY_API_KEY ('VerysecretKey') is set. Ensure that users are clearly warned that this default is insecure for production environments.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
6. src/rotator_library/credential_tool.py:130
- Draft comment: The hardcoded provider list in setup_api_key is extensive. Consider externalizing these settings (e.g. in a config file) for easier updates and maintenance.
- Reason this comment was not posted: Comment was not on a location in the diff, so it can't be submitted as a review comment.
7. src/rotator_library/credential_tool.py:792
- Draft comment: The export and combine credential functions offer flexible export options. Consider adding additional error handling around file I/O operations to catch and report permission or disk errors.
- Reason this comment was not posted: Comment looked like it was already resolved.
Workflow ID: wflow_Pm3cUd40zCdw7kR3
except Exception as e:
    logging.error(f"Failed to update metadata for '{path}': {e}")
# Update metadata (skip for env-based credentials - they don't have files)
if not path.startswith("env://"):
Consider defining a constant for 'env://' to improve maintainability when skipping metadata update for env-based credentials.
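One way to act on this suggestion is sketched below. The constant name `ENV_CREDENTIAL_PREFIX` and the helper `is_env_credential()` are hypothetical names chosen for illustration; they are not taken from the PR, which already has its own `_parse_env_credential_path()` helper.

```python
# Hypothetical constant: single definition of the virtual path scheme
# used for environment-based credentials.
ENV_CREDENTIAL_PREFIX = "env://"


def is_env_credential(path: str) -> bool:
    """Return True if the credential path is virtual (env-based), not a file."""
    return path.startswith(ENV_CREDENTIAL_PREFIX)


def should_update_metadata(path: str) -> bool:
    """Metadata files exist only for file-based credentials."""
    return not is_env_credential(path)
```

With this in place, call sites would test `should_update_metadata(path)` instead of repeating the string literal.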
Introduces a new model information service that fetches pricing and capability data from external catalogs (OpenRouter and Models.dev) to enrich the /v1/models endpoint and enable cost estimation.
- Implements ModelRegistry class with async background data fetching to avoid blocking proxy startup
- Adds fuzzy model ID matching with multi-source data aggregation
- Expands /v1/models endpoint with optional enriched response containing pricing, token limits, and capability flags
- Adds new endpoints: GET /v1/models/{model_id}, GET /v1/model-info/stats, POST /v1/cost-estimate
- Supports per-token pricing for input, output, cache read, and cache write operations
- Integrates with lifespan management for proper service initialization and cleanup
- Includes comprehensive backward compatibility layer for gradual migration
The service refreshes data every 6 hours (configurable via MODEL_INFO_REFRESH_INTERVAL) and runs asynchronously to maintain fast proxy initialization times.
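As a rough illustration of the per-token pricing described above, a cost estimate can be computed as a sum over the four token categories. This is a minimal sketch under stated assumptions: `TokenPricing` and `estimate_cost()` are hypothetical names, and the actual request and response shape of `POST /v1/cost-estimate` is not shown in this PR description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TokenPricing:
    """Hypothetical per-token prices (currency units per single token)."""
    input: float = 0.0
    output: float = 0.0
    cache_read: float = 0.0
    cache_write: float = 0.0


def estimate_cost(
    pricing: TokenPricing,
    input_tokens: int = 0,
    output_tokens: int = 0,
    cache_read_tokens: int = 0,
    cache_write_tokens: int = 0,
) -> float:
    """Sum token count times per-token price for each category."""
    return (
        input_tokens * pricing.input
        + output_tokens * pricing.output
        + cache_read_tokens * pricing.cache_read
        + cache_write_tokens * pricing.cache_write
    )
```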
Generated with ❤️ by ellipsis.dev
…validation

Enhanced the Gemini 3 system instruction with more comprehensive and explicit rules for tool parameter usage to prevent hallucination and schema mismatches.
- Rewrote DEFAULT_GEMINI3_SYSTEM_INSTRUCTION with clearer structure and XML-style tags for better model parsing
- Added explicit warnings about pre-trained tool knowledge being invalid in custom environments
- Included detailed guidance on array parameters, nested objects, and common failure patterns
- Enhanced _clean_claude_schema to handle 'anyOf' and 'oneOf' by selecting the first option (Claude doesn't support these constructs)
- Added temperature parameter handling with explicit Gemini 3 default of 1.0 for better tool use performance

These changes address recurring issues where the model would use parameter names from its training data instead of reading the actual JSON schema definitions, particularly for tools with array-of-objects parameters.
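The "select the first option" handling for `anyOf` and `oneOf` could look roughly like the sketch below. It is not the PR's `_clean_claude_schema`; `pick_first_variant()` is a hypothetical stand-alone function, and merging the chosen variant over sibling keys such as `description` is an assumption.

```python
from typing import Any


def pick_first_variant(schema: Any) -> Any:
    """Recursively replace anyOf/oneOf with their first option.

    Sibling keys (for example "description") are kept, and the first
    variant's keys are merged on top of them. The input is not mutated.
    """
    if isinstance(schema, list):
        return [pick_first_variant(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    for key in ("anyOf", "oneOf"):
        variants = schema.get(key)
        if isinstance(variants, list) and variants and isinstance(variants[0], dict):
            merged = {k: v for k, v in schema.items() if k != key}
            merged.update(variants[0])
            # Recurse again: the chosen variant may itself contain anyOf/oneOf.
            return pick_first_variant(merged)
    return {k: pick_first_variant(v) for k, v in schema.items()}
```

Picking the first option loses information about the other variants, so this only makes sense for targets that reject these keywords outright.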
…l calls

This commit introduces a strict schema enforcement mechanism to prevent Gemini 3 models from hallucinating parameters not defined in tool schemas.
- Add new `_enforce_strict_schema()` method that recursively adds `additionalProperties: false` to all object schemas in tool definitions
- Introduce `ANTIGRAVITY_GEMINI3_STRICT_SCHEMA` environment variable (defaults to True) to control strict schema enforcement
- Enhance `_format_type_hint()` to provide more detailed parameter type information including enum values, const values, nested objects, and recursive type hints
- Update Gemini 3 description prompt with explicit warning against using parameters from training data
- Integrate strict schema enforcement into the Gemini 3 tool transformation pipeline
- Add strict schema configuration to debug logging output

The strict schema enforcement tells the model it cannot add properties not explicitly defined in the schema, significantly reducing parameter hallucination issues. The enhanced type hints provide clearer guidance to the model about expected parameter formats.
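The recursive step can be sketched as follows. This is an illustrative stand-alone function, not the PR's `_enforce_strict_schema()` method; treating a schema as an object schema when its `type` equals `"object"` is an assumption about how the real method detects them.

```python
from typing import Any


def enforce_strict_schema(schema: Any) -> Any:
    """Return a copy of the schema with additionalProperties: false
    set on every object schema, at any nesting depth."""
    if isinstance(schema, list):
        return [enforce_strict_schema(item) for item in schema]
    if not isinstance(schema, dict):
        return schema
    # Rebuild the dict so the caller's schema is left untouched.
    result = {key: enforce_strict_schema(value) for key, value in schema.items()}
    if result.get("type") == "object":
        result["additionalProperties"] = False
    return result
```

Applied to a tool whose parameter is an array of objects, both the top-level object and the array's item schema end up closed to extra properties.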
…improve Gemini 3 tool call reliability

This commit addresses issues with schema compatibility and tool call hallucination across providers:
- **Antigravity Provider**: Expands the list of incompatible JSON Schema keywords that must be filtered out for Claude via Antigravity, including validation constraints (minLength, maxLength, minimum, maximum), metadata fields (title, examples, deprecated), and JSON Schema draft 2020-12 specific keywords that cause API rejections.
- **Gemini CLI Provider**: Significantly enhances the Gemini 3 tool calling system to prevent parameter hallucination:
  - Rewrites system instruction with more explicit warnings about custom tool schemas differing from training data
  - Adds common failure pattern examples to help the model avoid typical mistakes
  - Implements strict schema enforcement via `additionalProperties: false` to prevent invalid parameter injection
  - Improves parameter signature hints in tool descriptions with recursive type formatting, enum/const support, and nested object display
  - Adds new environment variable `GEMINI_CLI_GEMINI3_STRICT_SCHEMA` to control strict schema enforcement
  - Enhances type hint formatting to show array-of-objects structures more clearly

These changes work together to reduce tool call errors by making schema constraints more explicit to both the Antigravity API and the Gemini 3 model.
…ployment logs

Removes verbose DEBUG-REMOVE diagnostic print statements that were used for troubleshooting .env loading and credential discovery during development.
- Removes ~25 debug print statements from main.py and credential_manager.py
- Adds concise, production-friendly logging for deployment verification:
  - .env file loading summary with file names
  - Credential loading summary with provider:count format
- Preserves essential startup information for operational visibility
- Improves code readability by removing debugging clutter
- Maintains helpful deployment context without verbose diagnostic output
The project metadata loading and persistence logic was attempting to perform file I/O operations on env:// credential paths, which represent environment-based credentials rather than file-based ones. This caused unnecessary file operation errors.
- Add checks using `_parse_env_credential_path()` to detect env:// paths before attempting file operations
- Skip loading persisted project metadata from files for env:// credentials
- Skip persisting project metadata to files for env:// credentials
- Add debug logging to indicate when persistence is being skipped for env:// paths

This prevents FileNotFoundError exceptions and improves reliability when using environment-based credential configuration.
Changed all `is_ready()` method calls to `is_ready` property access in the model_info_service across three endpoint functions:
- list_models endpoint for enriched model data
- get_model endpoint for model information retrieval
- cost_estimate endpoint for cost calculation

This aligns with the service's implementation where is_ready is exposed as a property rather than a callable method.
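The failure mode behind this fix can be reproduced with a small stand-in class. `ModelInfoServiceSketch` and `mark_ready()` are hypothetical; only the `is_ready` property name comes from the commit message.

```python
class ModelInfoServiceSketch:
    """Hypothetical stand-in for a service exposing readiness as a property."""

    def __init__(self) -> None:
        self._ready = False

    @property
    def is_ready(self) -> bool:
        return self._ready

    def mark_ready(self) -> None:
        self._ready = True


def call_as_method(service: ModelInfoServiceSketch) -> str:
    """Show what happens when the property is called like a method."""
    try:
        service.is_ready()  # property returns a bool; calling a bool fails
    except TypeError:
        return "TypeError"
    return "ok"
```

Because the property already evaluates to a `bool`, appending `()` tries to call that `bool`, which raises `TypeError` at request time rather than at import time.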
📋 Summary
This PR introduces significant enhancements to the LLM API Key Proxy, including a new Antigravity provider with Gemini 3 support, intelligent credential prioritization, configurable rotation strategies, and a refactored OAuth architecture. These changes improve security, reliability, and expand model support while maintaining backward compatibility.
🎯 Major Features
1. 🚀 Antigravity Provider (New)
The most sophisticated provider implementation to date, supporting Google's internal Antigravity API with full support for cutting-edge models:
Supported Models:
`thinkingBudget` parameter
Advanced Features:
Configuration: Full OAuth 2.0 support with stateless deployment capability
File Logging: Optional transaction logging for debugging
Files Added:
- `src/rotator_library/providers/antigravity_provider.py` (1,616 lines)
- `src/rotator_library/providers/antigravity_auth_base.py`

2. 🎯 Credential Prioritization System
Intelligent credential tier detection and priority-based selection ensures optimal credential usage:
- `get_credential_priority()` to return priority levels (1=highest, 10=lowest)
- `get_model_tier_requirement()` to specify minimum priority for models

Example Implementation (Gemini CLI):
Benefits:
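To make the priority semantics concrete, here is a minimal sketch of tier filtering. It is not the Gemini CLI implementation referenced above. `filter_by_tier()` is a hypothetical helper, and excluding credentials whose priority is unknown (`None`) when a model has a requirement is an assumption, since the PR does not state that behavior.

```python
from typing import Dict, List, Optional


def filter_by_tier(
    priorities: Dict[str, Optional[int]],
    required_priority: Optional[int],
) -> List[str]:
    """Return credentials allowed for a model, best priority first.

    priorities maps credential id -> priority (1 = highest, 10 = lowest,
    None = unknown). required_priority is the minimum priority the model
    needs; None means the model has no tier requirement.
    """
    if required_priority is None:
        return list(priorities)
    eligible = [
        cred
        for cred, priority in priorities.items()
        if priority is not None and priority <= required_priority
    ]
    # Lower number means higher priority, so sort ascending.
    return sorted(eligible, key=lambda cred: priorities[cred])
```

For example, with a paid credential at priority 1 and a free one at priority 2, a model that requires priority 1 would only ever be served by the paid credential.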
3. 🎲 Weighted Random Rotation
Configurable credential rotation strategy for enhanced security and unpredictability:
`rotation_tolerance` Parameter (default: 3.0):
- `0.0`: Deterministic - always selects least-used credential (perfect balance)
- `2.0-4.0` (recommended): Weighted random with bias toward less-used credentials
- `5.0+`: High randomness for maximum unpredictability

Formula:
`weight = (max_usage - credential_usage) + tolerance + 1`

Security Benefits:
Configuration:
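The weight formula for this rotation strategy can be exercised with a short sketch. This is not the PR's `_select_weighted_random()`; the function names are hypothetical, and handling `tolerance == 0` as a separate deterministic branch is an assumption made to match the documented behavior for `0.0`.

```python
import random
from typing import Dict, Optional


def selection_weights(usage: Dict[str, int], tolerance: float) -> Dict[str, float]:
    """weight = (max_usage - credential_usage) + tolerance + 1"""
    max_usage = max(usage.values())
    return {cred: (max_usage - used) + tolerance + 1 for cred, used in usage.items()}


def select_credential(
    usage: Dict[str, int],
    tolerance: float,
    rng: Optional[random.Random] = None,
) -> str:
    """Pick a credential, biased toward less-used ones."""
    if tolerance == 0:
        # Assumed special case: always take the least-used credential.
        return min(usage, key=lambda cred: usage[cred])
    weights = selection_weights(usage, tolerance)
    creds = list(weights)
    chooser = rng if rng is not None else random
    return chooser.choices(creds, weights=[weights[c] for c in creds], k=1)[0]
```

With usage counts of 10 and 4 and a tolerance of 3.0, the weights are 4.0 and 10.0, so the less-used credential is picked about 71% of the time. Raising the tolerance adds the same constant to every weight, which flattens the distribution toward uniform.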
4. 🔧 Enhanced Gemini CLI Provider
Significant improvements to Gemini CLI authentication and model support:
Improved Project Discovery:
- `GEMINI_CLI_PROJECT_ID` override

Gemini 3 Support:
- `thinkingLevel` configuration

Credential Prioritization: Automatic paid vs free tier detection and priority assignment
5. 🗄️ Provider Cache System (New)
Modular, shared caching system for provider conversation state:
Architecture:
Key Methods:
- `store()` / `store_async()`: Synchronous/async storage
- `retrieve()` / `retrieve_async()`: With disk fallback

Use Cases:
Files Added:
- `src/rotator_library/providers/provider_cache.py` (498 lines)

6. 🔐 Refactored OAuth Architecture
Shared OAuth base class eliminates code duplication:
`GoogleOAuthBase` Class: Single source of truth for all OAuth logic
Benefits:
Inherited Features:
Refactored Providers:
- `GeminiAuthBase` → extends `GoogleOAuthBase`
- `AntigravityAuthBase` → extends `GoogleOAuthBase`

Files Added:
- `src/rotator_library/providers/google_oauth_base.py` (653 lines)

7. 🌡️ Temperature Override
Global temperature=0 override to prevent tool hallucination:
Modes:
- `"remove"`: Deletes temperature=0 from requests
- `"set"`: Changes temperature=0 to temperature=1.0
- `"false"`: Disabled (default)

Configuration:
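The three temperature override modes can be summarized in a small sketch. `apply_temperature_override()` is a hypothetical function operating on a plain request dict; how the proxy actually reads `OVERRIDE_TEMPERATURE_ZERO` and where it applies the override are not shown here.

```python
from typing import Any, Dict


def apply_temperature_override(request: Dict[str, Any], mode: str) -> Dict[str, Any]:
    """Return a copy of the request with the temperature=0 override applied.

    mode is "remove", "set", or "false" (disabled). Requests whose
    temperature is missing or non-zero are returned unchanged.
    """
    result = dict(request)
    if mode not in ("remove", "set") or result.get("temperature") != 0:
        return result
    if mode == "remove":
        del result["temperature"]  # let the provider use its own default
    else:
        result["temperature"] = 1.0
    return result
```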
8. 🛠️ Tool Improvements
Enhanced Credential Tool:
`.env` format
Updated Launcher:
📝 Documentation Updates
Major Documentation Changes:
`DOCUMENTATION.md`:
`README.md`:
`Deployment guide.md`:
`src/rotator_library/README.md`:

🔧 Technical Changes

Client (`client.py`)
- `rotation_tolerance` parameter to constructor
- `_make_completion_request` and `acompletion_stream`

Usage Manager (`usage_manager.py`)
- `_select_weighted_random()` method
- `acquire_key()` with priority group support
- `credential_priorities` parameter

Provider Interface (`provider_interface.py`)
- `get_credential_priority()` method
- `get_model_tier_requirement()` method

Credential Manager (`credential_manager.py`)
- `DEFAULT_OAUTH_DIRS`

Factory (`provider_factory.py`)
- `AntigravityAuthBase` provider

Proxy App (`main.py`)

🐛 Bug Fixes
📦 Dependencies & Configuration
New Configuration Options:
Antigravity Provider:
Rotation Strategy:
Temperature Override:
Updated `.gitignore`:
- `*.log` exclusion (to allow log directories)
- `launcher_config.json`
- `cache/` directory

🚀 Migration Guide
For Existing Users:
No Breaking Changes: All existing functionality remains backward compatible
Optional New Features:
- `rotation_tolerance` for weighted random rotation (recommended: 3.0)

Gemini CLI Users:
- `GEMINI_CLI_PROJECT_ID` if using paid tier

New Environment Variables (all optional):
- `ROTATION_TOLERANCE` (default: 0.0)
- `OVERRIDE_TEMPERATURE_ZERO` (default: false)

📊 Statistics
Important
This PR adds an Antigravity provider, credential prioritization, and enhanced OAuth architecture to the LLM API Key Proxy, supporting Gemini 3 models and implementing a weighted random rotation strategy.
- `antigravity_provider.py` with Gemini 3 support and advanced features like thought signature caching and tool hallucination prevention.
- `get_credential_priority()` and `get_model_tier_requirement()` for intelligent credential selection.
- `usage_manager.py` for enhanced security and unpredictability.
- `rotation_tolerance` parameter controls randomness.
- `google_oauth_base.py` for shared logic across providers.
- `provider_cache.py` for shared caching system.
- `client.py` and `main.py` for new features and configurations.
- `DOCUMENTATION.md` and `README.md`.

This description was created by Ellipsis for bd8f638. It will automatically update as commits are pushed.